Competition on Spatial Statistics for Large Datasets

Abstract

As spatial datasets are becoming increasingly large and unwieldy, exact inference on spatial models becomes computationally prohibitive. Various approximation methods have been proposed to reduce the computational burden. Although comprehensive reviews of these approximation methods exist, comparisons of their performances are limited to small and medium sizes of datasets for a few selected methods. To achieve a comprehensive comparison comprising as many methods as possible, we organized the Competition on Spatial Statistics for Large Datasets. This competition had the following novel features: (1) we generated synthetic datasets with the ExaGeoStat software so that the number of generated realizations ranged from 100 thousand to 1 million; (2) we systematically designed the data-generating models to represent spatial processes with a wide range of statistical properties for both Gaussian and non-Gaussian cases; (3) the competition tasks included both estimation and prediction, and the results were assessed by multiple criteria; and (4) we have made all the datasets and competition results publicly available to serve as a benchmark for other approximation methods. In this paper, we disclose all the competition details and results along with some analysis of the competition outcomes.

Related resources

Likelihoods for Large Spatial Datasets

Datasets in the fields of climate and environment are often very large and irregularly spaced. To model such datasets, the widely used Gaussian process models in spatial statistics face tremendous challenges due to the prohibitive computational burden. Various approximation methods have been introduced to reduce the computational cost. However, most of them rely on unrealistic assumptions of th...

Bayesian Modeling for Large Spatial Datasets.

We focus upon flexible Bayesian hierarchical models for scientific data available at geo-coded locations. Investigators are increasingly turning to spatial process models to analyze such datasets. These models are customarily estimated using Markov Chain Monte Carlo (MCMC) methods, which have become especially popular for spatial modeling, given their flexibility and power to fit models that wo...

Sparse Density Representations for Simultaneous Inference on Large Spatial Datasets

Large spatial datasets often represent a number of spatial point processes generated by distinct entities or classes of events. When crossed with covariates, such as discrete time buckets, this can quickly result in a data set with millions of individual density estimates. Applications that require simultaneous access to a substantial subset of these estimates become resource constrained when d...

Cached Sufficient Statistics for Efficient Machine Learning with Large Datasets

This paper introduces new algorithms and data structures for quick counting for machine learning datasets. We focus on the counting task of constructing contingency tables, but our approach is also applicable to counting the number of records in a dataset that match conjunctive queries. Subject to certain assumptions, the costs of these operations can be shown to be independent of the...

Cached Sufficient Statistics for Efficient Machine Learning with Large Datasets

This paper introduces new algorithms and data structures for quick counting for machine learning datasets. We focus on the counting task of constructing contingency tables, but our approach is also applicable to counting the number of records in a dataset that match conjunctive queries. Subject to certain assumptions, the costs of these operations can be shown to be independent of the number of...

Journal

Journal title: Journal of Agricultural, Biological and Environmental Statistics

Year: 2021

ISSN: 1085-7117, 1537-2693

DOI: https://doi.org/10.1007/s13253-021-00457-z